Abstract: This work proposes an algebraic model for classical
information theory. We first give an algebraic model of probability
theory. Information-theoretic constructs are based on this model. In
addition to the theoretical insights provided by our model, one obtains
new computational and analytical tools. Several important theorems of
classical probability and information theory are presented in the
algebraic framework.



I. Introduction 

The present paper reports a brief synopsis of our work on an algebraic
model of classical information theory based on operator algebras. Let
us recall a simple model of a communication system proposed by Shannon
[Sha48]. This model has essentially four components: source, channel,
encoder/decoder and receiver. Some amount of noise affects every stage
of the operation and the behavior of the components is generally
modeled by stochastic processes. In this work our primary focus will be
on discrete processes. A discrete source can be viewed as a generator
of a countable set of random variables. In a communication process the
source generates a sequence of random variables. This sequence is sent
through the channel (with encoding/decoding) and the output at the
receiver is another sequence of random variables. Thus, the concrete
objects or observables, to use the language of quantum theory, are
modeled as random variables. The underlying probability space is
primarily used to define probability distributions or states associated
with the relevant random variables. In the algebraic approach we
directly model the observables. Since random variables can be added and
multiplied,¹ they constitute an algebra. This is our starting point. In
fact, the algebra of random variables has a richer structure called a
C* algebra. Starting with a C* algebra of observables we can define
most important concepts in probability theory in general and
information theory in particular. A natural question is: why should we
adopt this algebraic approach? We discuss the reasons below.

First, it seems more appropriate to deal with the "concrete"
quantities, viz. observables and their intrinsic structure. The choice
of underlying probability space is somewhat arbitrary, as a comparison
of standard textbooks on information theory [CT99], [CK81] reveals.
Moreover, from the algebra of observables we can recover particular
probability spaces from representations of the algebra. Second, some
constraints may have to be imposed on the set of random variables. In
security protocols different participants have access to different sets
of observables and may assign different probability structures. In this
case, the algebraic approach seems more natural: we have to study
different subalgebras. Third, the algebraic approach gives us new
theoretical insights and computational tools. This will be justified in
the following sections. Finally, and this was our original motivation,
the algebraic approach provides the basic framework for a unified
approach to classical and quantum information. All quantum protocols
have some classical components, e.g. classical communication,
"coin-tosses" etc. But the languages of the two processes, classical
and quantum, seem quite different. In the former we are dealing with
random variables defined on one or more probability spaces whereas in
the latter we are processing quantum states which also give complete
information about the measurement statistics of quantum observables.
The algebraic framework is eminently suitable for bringing together
these somewhat disparate viewpoints. Classical observables are simply
elements that commute with every element in the algebra.

¹ We assume that they are real or complex valued.

Connections between operator algebras and information theory, both
classical and quantum, have appeared in the scientific literature since
the beginnings of information theory and operator algebras (see e.g.
[Ume62], [Seg60], [Ara75], [Key02], [BKK07], [KW06]). Most previous
work focuses on particular aspects of information theory, such as
noncommutative generalizations of the concept of entropy. There does
not appear to be a unified and coherent approach based on intrinsically
algebraic notions. The construction of such a model is one of the goals
of the paper. As probabilistic concepts play such an important role in
the development of information theory we first present an algebraic
approach to probability. I. E. Segal [Seg54] first proposed such an
algebraic model of probability theory. Later Voiculescu [VDN92]
developed noncommutative or "free probability" theory. We believe
several aspects of our approach are novel and yield deeper insights
into information processes. In this summary, we have omitted most
proofs or give only brief outlines. The full proofs can be found in our
arXiv submission [PB]. A brief outline of the paper follows.

In Section II we give the basic definitions of C* algebras. This is
followed by an account of probabilistic concepts from an algebraic
perspective. In particular, we investigate the fundamental notion of
independence and demonstrate how it relates to the algebraic structure.
One important aspect in which our approach seems novel is the treatment
of probability distribution functions. In Section III we give a precise
algebraic model of an information/communication system. The fundamental
concept of entropy is introduced. We also define and study the crucial
notion of a channel as a (completely) positive map. In particular, the
channel coding theorem is presented as an approximation result. Stated
informally: every channel other than the useless ones can be
approximated by a lossless channel under appropriate coding. We
conclude the paper with some comments and discussions.

II. C* Algebras and Probability

A Banach algebra $A$ is a complete normed algebra [Rud87], [KR97].
That is, $A$ is an algebra over the real ($\mathbb{R}$) or complex
($\mathbb{C}$) numbers, for every $x \in A$ a norm $\|x\| \ge 0$ is
defined satisfying the usual properties, and every Cauchy sequence
converges in the norm. A C* algebra $B$ is a Banach algebra [KR97] with
an anti-linear involution $*$ ($x^{**} = x$ and
$(x + cy)^* = x^* + \bar{c}y^*$ for $x, y \in B$ and $c \in \mathbb{C}$)
such that $\|xx^*\| = \|x\|^2$ and $(xy)^* = y^*x^*$ for all
$x, y \in B$. This implies that $\|x\| = \|x^*\|$. We often assume that
the unit $I \in B$. The fundamental Gelfand-Naimark-Segal (GNS) theorem
states that every C* algebra can be isometrically embedded in some
$\mathcal{L}(H)$, the set of bounded operators on a Hilbert space $H$.
The spectrum of an element $x \in B$ is defined by
$\mathrm{sp}(x) = \{c \in \mathbb{C} : x - cI \text{ is not invertible}\}$.
The spectrum is a nonempty closed and bounded set and hence compact. An
element $x$ is self-adjoint if $x = x^*$, normal if $x^*x = xx^*$ and
positive (strictly positive) if $x$ is self-adjoint and
$\mathrm{sp}(x) \subset [0, \infty)$ ($(0, \infty)$). A self-adjoint
element has a real spectrum and, for normal elements, conversely. Since
$x = x_1 + ix_2$ with $x_1 = (x + x^*)/2$ and $x_2 = (x - x^*)/2i$, any
element of a C* algebra can be decomposed into self-adjoint "real" and
"imaginary" parts. The positive elements define a partial order on $A$:
$x \le y$ iff $y - x \ge 0$ (positive). A positive element $a$ has a
unique square root $\sqrt{a}$ such that $\sqrt{a} \ge 0$ and
$(\sqrt{a})^2 = a$. If $x$ is self-adjoint, $x^2 \ge 0$ and
$|x| = \sqrt{x^2}$. A self-adjoint element $x$ has a decomposition
$x = x_+ - x_-$ into positive and negative parts where
$x_+ = (|x| + x)/2$ and $x_- = (|x| - x)/2$ are positive. An element
$p \in B$ is a projection if $p$ is self-adjoint and $p^2 = p$. Given
two C* algebras $A$ and $B$, a homomorphism $F$ is a linear map
preserving the product and $*$ structures. A linear map (not
necessarily a homomorphism) is positive if it maps positive elements to
positive elements. A (linear) functional on $A$ is a linear map
$A \to \mathbb{C}$. A positive functional $\omega$ such that
$\omega(I) = 1$ is called a state. The set of states $\mathcal{S}$ is
convex. The extreme points are called pure states and $\mathcal{S}$ is
the convex closure of the pure states (Krein-Milman theorem). A set
$S \subset A$ is called a subalgebra if it is a C* algebra with the
inherited product. A subalgebra is called unital if it contains the
identity of $A$. Our primary interest will be in abelian or commutative
algebras. The basic representation theorem (Gelfand-Naimark) [KR97]
states that an abelian C* algebra with unity is isomorphic to the
algebra $C(X)$ of continuous complex-valued functions on a compact
Hausdorff space $X$.

Now let $X = \{a_1, \dots, a_n\}$ be a finite set with the discrete
topology. Then $A = C(X)$ is the set of all functions
$X \to \mathbb{C}$. The algebra $C(X)$ can be considered as the algebra
of (complex) random variables on the finite probability space $X$. Let
$x_i(a_j) = \delta_{ij}$, $i, j = 1, \dots, n$. Here $\delta_{ij} = 1$
if $i = j$ and 0 otherwise. The functions $x_i \in A$ form a basis for
$A$. Their multiplication table is particularly simple:
$x_i x_j = \delta_{ij} x_i$. They also satisfy $\sum_i x_i = 1$. These
are projections in $A$. They are orthogonal in the sense that
$x_i x_j = 0$ for $i \ne j$. We call any basis consisting of elements
of norm 1 with distinct elements orthogonal an atomic basis. A set of
linearly independent elements $\{y_i\}$ satisfying $\sum_i y_i = 1$ is
said to be complete. The next theorem gives us the general structure of
any finite-dimensional abelian C* algebra.

Theorem 1. Let $A$ be a finite-dimensional abelian C* algebra. Then
there is a unique (up to permutations) complete atomic basis
$B = \{x_1, \dots, x_n\}$. That is, the basis elements satisfy

$$x_i^* = x_i, \quad x_i x_j = \delta_{ij} x_i, \quad \|x_i\| = 1 \quad \text{and} \quad \sum_i x_i = 1. \tag{1}$$

Let $x = \sum_i a_i x_i \in A$. Then $\mathrm{sp}(x) = \{a_i\}$ and
hence $\|x\| = \max_i\{|a_i|\}$.
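
To make Theorem 1 concrete, the following Python sketch models the algebra of functions on a three-point set and checks the defining relations of an atomic basis, together with the spectrum and norm formulas. The representation of elements as tuples of values and the helper names (`mul`, `star`, `norm`, `combine`, `satisfies_relations`) are our own illustrative assumptions, not notation from the paper.

```python
# Elements of C(X), X = {a_1, ..., a_n}, stored as tuples of values.
# All operations are pointwise. Helper names are illustrative only.
n = 3

def mul(x, y):
    return tuple(a * b for a, b in zip(x, y))

def star(x):
    return tuple(complex(a).conjugate() for a in x)

def norm(x):
    # sup norm, which is the C* norm on C(X)
    return max(abs(a) for a in x)

def combine(coeffs, elems):
    # linear combination sum_i coeffs[i] * elems[i]
    return tuple(sum(c * e[k] for c, e in zip(coeffs, elems)) for k in range(n))

# Atomic basis: x_i(a_j) = delta_ij.
basis = [tuple(1 if i == j else 0 for j in range(n)) for i in range(n)]
zero, one = (0,) * n, (1,) * n

def satisfies_relations(basis):
    # self-adjoint, orthogonal idempotents, norm 1, summing to the unit
    self_adjoint = all(star(x) == x for x in basis)
    orthogonal = all(mul(x, y) == (x if i == j else zero)
                     for i, x in enumerate(basis)
                     for j, y in enumerate(basis))
    unit_norm = all(norm(x) == 1 for x in basis)
    complete = combine(one, basis) == one
    return self_adjoint and orthogonal and unit_norm and complete

coeffs = (2, -3j, 1)            # x = 2 x_1 - 3i x_2 + x_3
x = combine(coeffs, basis)
spectrum = set(x)               # sp(x) = {a_i}
print(satisfies_relations(basis), norm(x), max(abs(a) for a in coeffs))
```

In this representation the coefficients of an element in the atomic basis are simply its values, which is why the spectrum and the norm can be read off directly.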

We next describe an important construction for C* algebras. Given two
C* algebras $A$ and $B$, the tensor product $A \otimes B$ is defined as
follows. As a set it consists of all finite linear combinations of
symbols of the form $\{x \otimes y : x \in A, y \in B\}$ subject to the
condition that the map $(x, y) \to x \otimes y$ is bilinear. Hence, if
$\{x_i\}$ and $\{y_j\}$ are bases for $A$ and $B$ respectively then
$\{x_i \otimes y_j\}$ is a basis for $A \otimes B$. The linear space
$A \otimes B$ becomes an algebra by defining
$(x \otimes y)(u \otimes z) = xu \otimes yz$ and extending by
bilinearity. The $*$ is defined by $(x \otimes y)^* = x^* \otimes y^*$
and extending anti-linearly. We will define the norm in a more general
setting. Our basic model will be an infinite tensor product of finite
dimensional C* algebras which we present next.
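
For function algebras on finite sets the tensor product can be realized as functions on the product set. The sketch below uses this realization (an assumption made for illustration; the names `tensor`, `mul` and `atomic_basis` are ours) to check the product rule and to confirm that products of atomic basis elements again behave like an atomic basis.

```python
from itertools import product

def tensor(x, y):
    # x (tensor) y as a function on pairs (i, j), listed in row-major order
    return tuple(a * b for a, b in product(x, y))

def mul(x, y):
    return tuple(a * b for a, b in zip(x, y))

def atomic_basis(n):
    return [tuple(1 if i == j else 0 for j in range(n)) for i in range(n)]

basis_A, basis_B = atomic_basis(2), atomic_basis(3)
basis_AB = [tensor(x, y) for x in basis_A for y in basis_B]

# (x tensor y)(u tensor z) = xu tensor yz, checked on sample elements
x, u = (1, 2), (3, -1)
y, z = (0, 1, 4), (2, 2, 5)
product_rule = mul(tensor(x, y), tensor(u, z)) == tensor(mul(x, u), mul(y, z))

# the x_i tensor y_j are orthogonal idempotents summing to the unit
zero = (0,) * 6
orthogonal = all(mul(p, q) == (p if a == b else zero)
                 for a, p in enumerate(basis_AB)
                 for b, q in enumerate(basis_AB))
complete = tuple(map(sum, zip(*basis_AB))) == (1,) * 6
print(product_rule, orthogonal, complete, len(basis_AB))
```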

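Let $A_k$, $k = 1, 2, \dots$, be finite dimensional abelian C* algebras
with atomic bases $B_k = \{x_{k1}, \dots, x_{kn_k}\}$. Let $B^\infty$
be the set consisting of all infinite strings of the form
$z_{i_1} \otimes z_{i_2} \otimes \cdots$ where all but a finite number
($> 0$) of the $z_{i_k}$ are equal to 1 and if some $z_{i_k} \ne 1$
then $z_{i_k} \in B_k$. Let $\mathfrak{A} = \otimes_{i=1}^{\infty} A_i$
be the vector space with basis $B^\infty$ such that
$z_{i_1} \otimes z_{i_2} \otimes \cdots \otimes z_{i_k} \otimes \cdots$
is linear in each factor separately. We define a product in
$\mathfrak{A}$ as follows. First, for elements of $B^\infty$:
$(z_{i_1} \otimes z_{i_2} \otimes \cdots)(z'_{i_1} \otimes z'_{i_2} \otimes \cdots) = (z_{i_1}z'_{i_1} \otimes z_{i_2}z'_{i_2} \otimes \cdots)$.
We extend the product to the whole of $\mathfrak{A}$ by linearity. Next
define a norm by

$$\Big\|\sum_{i_1 i_2 \cdots} a_{i_1 i_2 \cdots}\, z_{i_1} \otimes z_{i_2} \otimes \cdots\Big\| = \sup\{|a_{i_1 i_2 \cdots}|\}.$$

$B^\infty$ is an atomic basis. It follows that $\mathfrak{A}$ is an
abelian normed algebra. We define the *-operation by conjugating the
coefficients:

$$\Big(\sum_{i_1 i_2 \cdots} a_{i_1 i_2 \cdots}\, z_{i_1} \otimes z_{i_2} \otimes \cdots\Big)^* = \sum_{i_1 i_2 \cdots} \bar{a}_{i_1 i_2 \cdots}\, z_{i_1} \otimes z_{i_2} \otimes \cdots.$$

It follows that for $x \in \mathfrak{A}$, $\|xx^*\| = \|x\|^2$.
Finally, we complete $\mathfrak{A}$ in this norm [KR97] and again call
the resulting algebra $\mathfrak{A}$. With these definitions
$\mathfrak{A}$ is a C* algebra. We call a C* algebra $B$ of finite type
if it is either finite dimensional or an infinite tensor product of
finite-dimensional algebras. An important special case is when all the
factor algebras $A_i = A$. We then write the infinite tensor product C*
algebra as $\otimes^\infty A$. Intuitively, the elements of an atomic
basis $B^\infty$ of $\otimes^\infty A$ correspond to strings from an
alphabet (represented by the basis $B$). Of particular interest is the
2-dimensional algebra $D$ corresponding to a binary alphabet.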

The next step is to describe the state space. Given a C* subalgebra
$V \subset A$ the set of states of $V$ will be denoted by
$\mathcal{S}(V)$. Let $\mathfrak{A} = \otimes_{i=1}^{\infty} A_i$
denote the infinite tensor product of finite-dimensional algebras
$A_i$. An infinite product state of $\mathfrak{A}$ is a functional of
the form $\Omega = \omega_1 \otimes \omega_2 \otimes \cdots$ such that
$\omega_i \in \mathcal{S}(A_i)$. This is indeed a state of
$\mathfrak{A}$, for if
$a_k = z_1 \otimes z_2 \otimes \cdots \otimes z_k \otimes 1 \otimes 1 \cdots \in \mathfrak{A}$
then $\Omega(a_k) = \omega_1(z_1)\omega_2(z_2) \cdots \omega_k(z_k)$, a
finite product. A general state on $\mathfrak{A}$ is a convex
combination of product states like $\Omega$. Finally, we discuss
another useful construction in a C* algebra $A$. If $f(z)$ is an
analytic function whose Taylor series $\sum_{n=0}^{\infty} a_n(z - c)^n$
converges in a region $|z - c| < R$, then the series
$\sum_{n=0}^{\infty} a_n(x - cI)^n$ converges for $x \in A$ with
$\|x - cI\| < R$ and it makes sense to talk of analytic functions on a
C* algebra. If we have an atomic basis $\{x_1, x_2, \dots\}$ in an
abelian C* algebra then the functions are particularly simple in this
basis. Thus if $x = \sum_i a_i x_i$ then $f(x) = \sum_i f(a_i)x_i$,
provided that the $f(a_i)$ are defined in an appropriate domain.
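
As a small numerical check of the formula $f(x) = \sum_i f(a_i)x_i$, the sketch below compares it with a truncated Taylor series evaluated inside the algebra, for $f = \exp$. Elements are represented by their coefficient tuples in an atomic basis; this representation, the number of terms and the function names are assumptions made for illustration.

```python
import math

def mul(x, y):
    return tuple(a * b for a, b in zip(x, y))

def f_via_basis(f, coeffs):
    # f(x) = sum_i f(a_i) x_i: apply f to each coefficient a_i
    return tuple(f(a) for a in coeffs)

def exp_via_series(coeffs, terms=40):
    # partial sum of sum_n x^n / n!, with powers computed in the algebra
    total = (0.0,) * len(coeffs)
    power = (1.0,) * len(coeffs)          # x^0 is the unit
    for n in range(terms):
        total = tuple(t + p / math.factorial(n) for t, p in zip(total, power))
        power = mul(power, coeffs)
    return total

x = (0.5, -1.0, 2.0)                      # coefficients a_i of x
direct = f_via_basis(math.exp, x)
series = exp_via_series(x)
max_gap = max(abs(d - s) for d, s in zip(direct, series))
print(max_gap)
```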

We gave a brief description of C* algebras. We now introduce an
algebraic model of probability which is used later to model
communication processes. In this model we treat random variables as
elements of a C* algebra. The probabilities are introduced via states.
A classical observable algebra is a complex abelian C* algebra $A$. We
can restrict our attention to real algebras whenever necessary. The
Riesz representation theorem [Rud87] makes it possible to identify a
state $\omega$ with some probability measure. A probability algebra is
a pair $(A, S)$ where $A$ is an observable algebra and
$S \subset \mathcal{S}(A)$ is a set of states. A probability algebra is
defined to be fixed if $S$ contains only one state.

Let $\omega$ be a state on an abelian C* algebra $A$. Call two elements
$x, y \in A$ uncorrelated in the state $\omega$ if
$\omega(xy) = \omega(x)\omega(y)$. This definition depends on the
state: two uncorrelated elements can be correlated in some other state
$\omega'$. A state $\omega$ is called multiplicative if
$\omega(xy) = \omega(x)\omega(y)$ for all $x, y \in A$. The set of
states, $\mathcal{S}(A)$, is convex. The extreme points of
$\mathcal{S}(A)$ are called pure states. In the case of abelian C*
algebras a state is pure if and only if it is multiplicative [KR97].
Thus, in a pure state any two observables are uncorrelated. This is not
generally true in the non-abelian quantum case. Now we can introduce
the important notion of independence. Given $S \subset A$ let $A(S)$
denote the subalgebra generated by $S$ (the smallest subalgebra of $A$
containing $S$). Two subsets $S_1, S_2 \subset A$ are defined to be
independent if all the pairs
$\{(x_1, x_2) : x_1 \in A(S_1), x_2 \in A(S_2)\}$ are uncorrelated. As
independence and correlation depend on the state we sometimes write
$\omega$-independent/uncorrelated. Independence is a much stronger
condition than being uncorrelated. The next theorem states the
structural implications of independence.

Theorem 2. Two sets of observables $S_1, S_2$ in a finite dimensional
abelian C* algebra $A$ are independent in a state $\omega$ if and only
if for the subalgebras $A(S_1)$ and $A(S_2)$ generated by $S_1$ and
$S_2$ respectively there exist states $\omega_1 \in \mathcal{S}(A(S_1))$,
$\omega_2 \in \mathcal{S}(A(S_2))$ such that
$(A(S_1) \otimes A(S_2), \{\omega_1 \otimes \omega_2\})$ is a cover of
$(A(S_1 S_2), \omega')$ where $A(S_1 S_2)$ is the subalgebra generated
by $\{S_1, S_2\}$ and $\omega'$ is the restriction of $\omega$ to
$A(S_1 S_2)$.
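
The difference between uncorrelated and independent observables can be seen in a very small example. The sketch below works in a function representation on a finite sample set and assumes that the generated subalgebras are taken to be unital, so that they are spanned by the indicator functions of level sets. The helper names (`state`, `atoms`, `uncorrelated`, `independent`) and both examples are our own illustrations.

```python
from fractions import Fraction as F
from itertools import product

def state(prob):
    # the state omega(f) = sum_c prob(c) f(c) on functions of sample points
    return lambda f: sum(p * f(c) for c, p in prob.items())

def uncorrelated(omega, f, g):
    return omega(lambda c: f(c) * g(c)) == omega(f) * omega(g)

def atoms(f, points):
    # atomic basis of the unital subalgebra generated by f:
    # indicator functions of the level sets of f
    values = sorted({f(c) for c in points})
    return [lambda c, v=v: 1 if f(c) == v else 0 for v in values]

def independent(omega, f, g, points):
    # by bilinearity it is enough to test all pairs of atoms
    return all(uncorrelated(omega, p, q)
               for p in atoms(f, points) for q in atoms(g, points))

# Example 1: uniform state on {-1, 0, 1}, with x(c) = c and y = x^2.
pts1 = [-1, 0, 1]
om1 = state({c: F(1, 3) for c in pts1})
x, y = (lambda c: c), (lambda c: c * c)

# Example 2: a product state on {0,1} x {0,1}; u, v read the coordinates.
pts2 = list(product((0, 1), repeat=2))
om2 = state({(a, b): (F(1, 4) if a else F(3, 4)) * (F(2, 3) if b else F(1, 3))
             for a, b in pts2})
u, v = (lambda c: c[0]), (lambda c: c[1])

results = (uncorrelated(om1, x, y),
           independent(om1, x, y, pts1),
           independent(om2, u, v, pts2))
print(results)
```

In the first example $x$ and $x^2$ are uncorrelated but not independent, while in the second the coordinate observables are independent because the state is a product state, in line with Theorem 2.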

We thus see the relation between independence and (tensor) product
states in the classical theory. Next we show how one can formulate
another important concept, the distribution function (d.f.), in the
algebraic framework. We restrict our analysis to C* algebras of finite
type. The general case is more delicate and is defined using
approximate identities in subalgebras in [PB]. The idea is that we
approximate indicator functions of sets by a sequence of elements in
the algebra. In the case of finite type algebras the sequence converges
to a projection operator $J_S$. Thus, if we consider a representation
where the elements of $A$ are functions on some finite set $F$ then
$J_S$ is precisely the indicator function of the set
$S' = \{c : x_i(c) - t_i \le 0,\ c \in F \text{ and } i = 1, \dots, n\}$.
The set $S'$ corresponds to the subalgebra $(S_t)_a$ and $J_S$, a
projection in $A$, acts as the identity in $(S_t)_a$. From the notion
of distribution functions we can now define probabilities
$\Pr(a \le x \le b)$ in the algebraic context. We can now formulate
problems in any discrete stochastic process in finite dimensions. The
algebraic method actually provides practical tools besides theoretical
insights, as the example of "waiting time" shows [PB].

Now we consider the algebraic formulation of a basic limit theorem of
probability theory: the weak law of large numbers. From the
information theory perspective it is perhaps the most useful limit
theorem. Let $X_1, X_2, \dots, X_n$ be independent, identically
distributed (i.i.d.) bounded random variables on a probability space
$\Omega$ with probability measure $P$. Let $\mu$ be the mean of $X_1$
and set $S_n = (X_1 + \cdots + X_n)/n$. Recall the weak law of large
numbers: given $\epsilon > 0$,

$$\lim_{n \to \infty} P(|S_n - \mu| > \epsilon) = 0.$$

We have an algebraic version of this important result.

Theorem 3 (Law of large numbers (weak)). If $x_1, \dots, x_n, \dots$
are $\omega$-independent self-adjoint elements in an observable algebra
and $\omega(x_i^k) = \omega(x_j^k)$ for all positive integers $i$, $j$
and $k$ (identically distributed) then

$$\lim_{n \to \infty} \omega\Big(\Big|\frac{x_1 + \cdots + x_n}{n} - \mu 1\Big|^k\Big) = 0 \quad \text{where } \mu = \omega(x_1) \text{ and } k > 0.$$

Using the algebraic version of the Chebyshev inequality the above
result implies the following. Let $x_1, \dots, x_n$ and $\mu$ be as in
the Theorem and set $s_n = (x_1 + \cdots + x_n)/n$. Then for any
$\epsilon > 0$ there exists $n_0$ such that for all $n > n_0$

$$P(|s_n - \mu| > \epsilon) < \epsilon.$$
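
The case $k = 2$ of Theorem 3 and the Chebyshev step can be checked exactly for a small example. The sketch below assumes two-valued signals in a product state with $\omega(x = 1) = p$ and enumerates all strings of length $n$; the function name `moments` and the chosen parameters are ours.

```python
from fractions import Fraction as F
from itertools import product

def moments(p, n, eps):
    """Exact omega_n(|s_n - mu|^2) and P(|s_n - mu| > eps) for n i.i.d.
    two-valued signals with omega(x = 1) = p, by enumerating strings."""
    mu = p
    second_moment, tail = F(0), F(0)
    for bits in product((0, 1), repeat=n):
        weight = F(1)
        for b in bits:
            weight *= p if b else 1 - p   # product state on the string
        dev = F(sum(bits), n) - mu        # s_n - mu on this string
        second_moment += weight * dev ** 2
        if abs(dev) > eps:
            tail += weight
    return second_moment, tail

p, eps = F(1, 3), F(1, 4)
table = {n: moments(p, n, eps) for n in (2, 4, 8, 12)}
for n, (m2, tail) in table.items():
    print(n, m2, float(tail))
```

For this source the second moment equals $p(1 - p)/n$, so it tends to zero, and the Chebyshev inequality bounds the tail probability by the second moment divided by $\epsilon^2$.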

III. Communication and Information 

We now come to our original theme: an algebraic frame- 
work for communication and information processes. Since our 
primary goal is the modeling of information processes we refer 
to the simple model of communication in the Introduction and 
model different aspects of it. In this work we will only deal 
with sources with a finite alphabet. 

Definition. A source is a pair $\mathcal{S} = (B, \Omega)$ where $B$ is
an atomic basis of a finite-dimensional abelian C* algebra $A$ and
$\Omega$ is a state in $\otimes^\infty A$.

This definition abstracts the essential properties of a source. The
basis $B$ is called the alphabet. A typical output of the source is of
the form
$x_1 \otimes x_2 \otimes \cdots \otimes x_k \otimes 1 \otimes \cdots \in B^\infty$,
the infinite product basis of $\otimes^\infty A$. We identify
$X_k = 1 \otimes \cdots \otimes 1 \otimes x_k \otimes 1 \otimes \cdots$
with the $k$th signal. If these are independent then Theorem 2 tells us
that $\Omega$ must be a product state. Further, if the state of the
source does not change then $\Omega = \omega \otimes \omega \otimes \cdots$
where $\omega$ is a state in $A$. For such a state $\omega$ define
$O_\omega = \sum_{i=1}^{n} \omega(x_i)x_i$, where
$B = \{x_1, \dots, x_n\}$. We say that $O_\omega$ is the
"instantaneous" output of the source in state $\omega$. Let $A'$ be
another finite-dimensional C* algebra with atomic basis $B'$. A source
coding is a linear map $f : B \to T = \sum_{i=1}^{k} \otimes^i A'$ such
that for $x \in B$,
$f(x) = x'_{i_1} \otimes x'_{i_2} \otimes \cdots \otimes x'_{i_r}$,
$r \le k$, with $x'_{i_j} \in B'$. Thus each "letter" in the alphabet
$B$ is coded by "words" of maximum length $k$ from $B'$.
A code $f : B \to T$ is defined to be prefix-free if for distinct
members $x_1, x_2$ of the atomic basis $B$, $f'(x_1)f'(x_2) = 0$ where
$f'$ is the map from $B$ into $\otimes^\infty A'$ induced by $f$
(padding each code word with 1s). That is, distinct elements of the
atomic basis $B$ are mapped to orthogonal elements. Thus the
"code-word"
$z_1 \otimes z_2 \otimes \cdots \otimes z_k \otimes 1 \otimes 1 \otimes \cdots$
is not orthogonal to another
$z'_1 \otimes z'_2 \otimes \cdots \otimes z'_m \otimes 1 \otimes 1 \otimes \cdots$
with $k \le m$ if and only if $z_1 = z'_1, \dots, z_k = z'_k$. The
useful Kraft inequality can be proved using algebraic techniques.
Corresponding to a finite sequence $k_1 \le k_2 \le \cdots \le k_m$ of
positive integers let $a_1, \dots, a_m$ be a set of prefix-free
elements in $\sum_{i \ge 1} \otimes^i A'$ such that
$a_i \in \otimes^{k_i} A'$. Further, suppose that each $a_i$ is a
tensor product of elements from $B'$. Then, with $n$ the number of
elements of $B'$,

$$\sum_{i=1}^{m} n^{k_m - k_i} \le n^{k_m}. \tag{2}$$

This inequality is proved by looking at bounds on dimensions of a
sequence of orthogonal subspaces. In the following, we restrict
ourselves to prefix-free codes. Using the convexity of the function
$f(x) = -\log x$ and the Kraft inequality (2) we deduce the following.

Proposition 1 (Noiseless coding). Let $\mathcal{S}$ be a source with
output $O_\omega \in A$, a finite-dimensional C* algebra with atomic
basis $\{x_1, \dots, x_n\}$ (the alphabet). Let $g$ be a prefix-free
code such that $g(x_i)$ is a tensor product of $k_i$ members of the
code basis. Then $\omega(\sum_i k_i x_i + \log O_\omega) \ge 0$.
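
In coordinates, the quantity in Proposition 1 is the mean code word length minus the entropy of the source. The sketch below illustrates this together with the Kraft inequality, representing code words as Python strings over the code alphabet and taking the logarithm to the base of the code alphabet size; both choices, and all names and sample codes, are assumptions made for illustration.

```python
import math

def is_prefix_free(words):
    return not any(a != b and b.startswith(a) for a in words for b in words)

def kraft_holds(words, n):
    # sum_i n^(k_m - k_i) <= n^(k_m), with k_m the largest word length
    k_m = max(len(w) for w in words)
    return sum(n ** (k_m - len(w)) for w in words) <= n ** k_m

def coding_gap(probs, words, n):
    # omega(sum_i k_i x_i + log O_omega) = mean length - entropy (base n)
    return sum(p * (len(w) + math.log(p, n)) for p, w in zip(probs, words))

probs = [0.4, 0.3, 0.2, 0.1]          # omega(x_i) for a 4-letter alphabet
code = ["0", "10", "110", "111"]      # words over the code alphabet {0, 1}
gap = coding_gap(probs, code, 2)

not_prefix_free = ["0", "1", "00"]    # "0" is a prefix of "00"
print(is_prefix_free(code), kraft_holds(code, 2), gap)
print(is_prefix_free(not_prefix_free), kraft_holds(not_prefix_free, 2))
```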

Next we give a simple application of the law of large numbers. First
define a positive functional Tr on a finite dimensional abelian C*
algebra $A$ with an atomic basis $\{x_1, \dots, x_d\}$ by
$\mathrm{Tr} = \omega_1 + \cdots + \omega_d$ where the $\omega_i$ are
the dual functionals. It is clear that Tr is independent of the choice
of atomic basis.

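Theorem 4 (Asymptotic Equipartition Property (AEP)). Let $\mathcal{S}$
be a source with output $O_\omega = \sum_{i=1}^{n} \omega(x_i)x_i$
where $\omega$ is a state on the finite dimensional algebra $A$ with
atomic basis $\{x_i\}$. Then given $\epsilon > 0$ there is a positive
integer $n_0$ such that for all $n > n_0$

$$P\big(2^{-n(H+\epsilon)} \le \otimes^n O_\omega \le 2^{-n(H-\epsilon)}\big) > 1 - \epsilon$$

where $H = -\omega(\log_2(O_\omega))$ is the entropy of the source and
the probability distribution is calculated with respect to the state
$\Omega_n = \omega \otimes \cdots \otimes \omega$ ($n$ factors) of
$\otimes^n A$. If $Q$ denotes the identity in the subalgebra generated
by $(\epsilon I - |\frac{1}{n}\log_2(\otimes^n O_\omega) + HI|)_+$ then

$$(1 - \epsilon)2^{n(H-\epsilon)} \le \mathrm{Tr}(Q) \le 2^{n(H+\epsilon)}.$$

Note that the element $Q$ is a projection on the subalgebra generated
by $(\epsilon I - |\frac{1}{n}\log_2(\otimes^n O_\omega) + HI|)_+$. It
corresponds to the set of strings whose probabilities are between
$2^{-n(H+\epsilon)}$ and $2^{-n(H-\epsilon)}$. The integer
$\mathrm{Tr}(Q)$ is simply the cardinality of this set.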
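
The counting interpretation of $\mathrm{Tr}(Q)$ can be explored numerically. The sketch below assumes a binary source with $\omega(x_1) = p$, takes the entropy as $H = -\sum_i \omega(x_i)\log_2 \omega(x_i)$, and calls a string typical when its probability lies strictly between $2^{-n(H+\epsilon)}$ and $2^{-n(H-\epsilon)}$. Strings are grouped by the number of $x_1$ signals they contain, so no explicit enumeration is needed. The function name and parameter values are ours.

```python
import math

def typical_set(p, n, eps):
    """For a binary source with omega(x_1) = p and omega(x_0) = 1 - p,
    return (H, Tr(Q), P(Q)): the entropy, the number of typical strings
    of length n, and their total probability."""
    q = 1 - p
    H = -(p * math.log2(p) + q * math.log2(q))
    count, prob = 0, 0.0
    for k in range(n + 1):                 # k = number of x_1 signals
        log_prob = k * math.log2(p) + (n - k) * math.log2(q)
        if abs(log_prob / n + H) < eps:    # the string is typical
            count += math.comb(n, k)
            prob += math.comb(n, k) * 2.0 ** log_prob
    return H, count, prob

n, eps = 400, 0.1
H, card, prob = typical_set(0.2, n, eps)
print(H, prob, math.log2(card) / n)
```

For these assumed parameters the typical set already carries most of the probability, while $\log_2 \mathrm{Tr}(Q)/n$ stays within $\epsilon$ of $H$.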

We now come to the most important part of the communication model: the
channel. The original paper of Shannon characterized channels by a
transition probability function. We will consider only (discrete)
memoryless channels (DMS). A DMS channel has an input alphabet $X$ and
output alphabet $Y$ and a channel transformation matrix $C(y_j|x_i)$
with $y_j \in Y$ and $x_i \in X$. Since the matrix $C(y_j|x_i)$
represents the probability that the channel outputs $y_j$ on input
$x_i$ we have $\sum_j C(y_j|x_i) = 1$ for all $i$:
$C(ij) = C(y_j|x_i)$ is row stochastic. This is the standard
formulation [CK81], [CT99]. We now turn to the algebraic formulation.
Definition. A DMS channel is a triple $\mathcal{C} = \{X, Y, C\}$
where $X$ and $Y$ are abelian C* algebras of dimension $m$ and $n$
respectively and $C : Y \to X$ is a unital positive map. The algebras
$X$ and $Y$ will be called the input and output algebras of the channel
respectively. Given a state $\omega$ on $X$ we say that $(X, \omega)$
is the input source for the channel.

Sometimes we write the entries of $C$ in the more suggestive form
$C_{ij} = C(y_j|x_i)$ where $\{y_j\}$ and $\{x_i\}$ are atomic bases
for $Y$ and $X$ respectively. Thus
$C(y_j) = \sum_i C_{ij}x_i = \sum_i C(y_j|x_i)x_i$. Note that in our
notation $C$ is an $m \times n$ matrix. Its transpose
$C^T_{ji} = C(y_j|x_i)$ is the channel matrix in the standard
formulation. We have to deal with the transpose because the channel is
a map from the output alphabet to the input alphabet. This may be
counterintuitive but observe that any map $Y \to X$ defines a unique
dual map $\mathcal{S}(X) \to \mathcal{S}(Y)$ on the respective state
spaces. Informally, a channel transforms a probability distribution on
the input alphabet to a distribution on the output. We characterize a
channel by input/output algebras (of observables) and a positive map.
Like the source output we now define a useful quantity called channel
output. Corresponding to the atomic basis $\{y_i\}$ of $Y$ let
$\{\otimes^k y_{i(k)}\}$ be an atomic basis in $\otimes^k Y$. Here
$i(k) = (i_1 i_2 \cdots i_k)$ is a multi-index. Similarly we have an
atomic basis $\{\otimes^k x_{j(k)}\}$ for $\otimes^k X$. The level-$k$
channel output is defined to be
$O_C^k = \sum_{i(k)} y_{i(k)} \otimes C^{(k)}(y_{i(k)})$. Here
$C^{(k)}$ represents the channel transition probability matrix on the
$k$-fold tensor product corresponding to strings of length $k$. In the
DMS case it is simply the $k$-fold tensor product of the matrix $C$.
The channel output defined here encodes the most important features of
the communication process. First, given the input source function
$I_{\omega^k} = \sum_i \omega^k(x_{i(k)})x_{i(k)}$ the output source
function is defined by

$$O_{\tilde{\omega}^k} = (1 \otimes \mathrm{Tr}_{\otimes^k X})\big((1 \otimes I_{\omega^k})O_C^k\big) = \sum_i \sum_j C(y_{i(k)}|x_{j(k)})\,\omega^k(x_{j(k)})\,y_{i(k)}.$$

Here, the state $\tilde{\omega}^k$ on the output space $\otimes^k Y$
can be obtained via the dual:
$\tilde{\omega}^k(y) = \tilde{C}^k(\omega^k)(y) = \omega^k(C^k(y))$.
The formula above is an alternative representation which is very
similar to the quantum case. The joint output of the channel can be
considered as the combined output of the two terminals of the channel.
Thus the joint output is

$$J_{\Omega^k} = (1 \otimes I_{\omega^k})O_C^k = \sum_{ij} \Omega^k(y_{i(k)} \otimes x_{j(k)})\, y_{i(k)} \otimes x_{j(k)}, \qquad \Omega^k(y_{i(k)} \otimes x_{j(k)}) = C(y_{i(k)}|x_{j(k)})\,\omega^k(x_{j(k)}). \tag{3}$$
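
At level 1 these definitions reduce to elementary matrix operations. The sketch below stores the channel as a row-stochastic matrix `C[i][j]` standing for $C(y_j|x_i)$, represents elements of $X$ and $Y$ by coefficient lists in their atomic bases, and computes the map $C : Y \to X$, the dual action on states and the joint state. The sample channel, the input state and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

# C[i][j] = C(y_j | x_i): probability of output y_j on input x_i.
C = [[F(3, 4), F(1, 4)],
     [F(1, 4), F(3, 4)]]
omega = [F(1, 3), F(2, 3)]            # input state: omega(x_i)

def channel_map(y):
    # C : Y -> X with C(y_j) = sum_i C(y_j | x_i) x_i, extended linearly;
    # y holds the coefficients of an element of Y in the basis {y_j}
    return [sum(C[i][j] * y[j] for j in range(len(y))) for i in range(len(C))]

def evaluate(state, x):
    # value of a state on an element given by its coefficients
    return sum(s * a for s, a in zip(state, x))

def output_state(state):
    # dual map on states: (output state)(y_j) = state(C(y_j))
    n_out = len(C[0])
    units = [[1 if k == j else 0 for k in range(n_out)] for j in range(n_out)]
    return [evaluate(state, channel_map(e)) for e in units]

def joint_state(state):
    # Omega(y_j tensor x_i) = C(y_j | x_i) omega(x_i), indexed as [j][i]
    return [[C[i][j] * state[i] for i in range(len(C))]
            for j in range(len(C[0]))]

unital = channel_map([1, 1]) == [1, 1]    # C maps the unit of Y to that of X
omega_out = output_state(omega)
joint = joint_state(omega)
print(unital, omega_out, joint)
```

Summing the joint state over the input index recovers the output state, which is the level-1 content of the partial trace formula.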

Let us analyze the algebraic definition of channel given above. For
simplicity of notation, we restrict ourselves to level 1. The explicit
representation of the channel output is
$\sum_i y_i \otimes \sum_j C(y_i|x_j)x_j$. We interpret this as
follows: if on the channel's out-terminal $y_i$ is observed then the
input could be $x_j$ with probability
$C(y_i|x_j)\omega(x_j)/\sum_j C(y_i|x_j)\omega(x_j)$. Now suppose that
for a fixed $i$, $C(y_i|x_j) = 0$ for all $j$ except one, say $j_i$.
Then on observing $y_i$ at the output we are certain that the input is
$x_{j_i}$. If this is true for all values of $y$ then we have an
instance of a lossless channel. Given $1 \le j \le n$ let $d_j$ be the
set of integers $i$ for which $C(y_i|x_j) > 0$. If the channel is
lossless then the $\{d_j\}$ form a partition of the set
$\{1, \dots, m\}$. The corresponding channel output is
$O_C = \sum_j \big(\sum_{i \in d_j} C(y_i|x_j)y_i\big) \otimes x_j$. At
the other extreme is the useless channel in which there is no
correlation between the input and the output. To define it formally,
consider a channel $\mathcal{C} = \{X, Y, C\}$ as above. The map $C$
induces a map $C' : Y \otimes X \to X$ defined by
$C'(y \otimes x) = xC(y)$. Given a state $\omega$ on $X$ the dual of
the map $C'$ defines a state $\Omega_C$ on $Y \otimes X$:
$\Omega_C(y \otimes x) = \omega(C'(y \otimes x)) = C(y|x)\omega(x)$. We
call $\Omega_C$ the joint (input-output) state of the channel. A
channel is useless if $Y$ and $X$ (identified as $Y \otimes 1$ and
$1 \otimes X$ resp.) are $\Omega_C$-independent. It is easily shown
that a channel $\mathcal{C} = \{X, Y, C\}$ with input source
$(X, \omega)$ is useless iff the matrix $C_{ij} = C(y_j|x_i)$ is of
rank 1. The algebraic version of the channel coding theorem assures
that it is possible to approximate, in the long run, an arbitrary
channel (excepting the useless case) by a lossless one.
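
The two extreme cases are easy to recognize from the channel matrix. The sketch below stores a channel as a row-stochastic matrix `C[i][j]` standing for $C(y_j|x_i)$ and uses the elementary fact that a row-stochastic matrix has rank 1 exactly when all its rows coincide. The sample matrices, the input state and the function names are our own illustrative assumptions.

```python
from fractions import Fraction as F

def is_row_stochastic(C):
    return all(sum(row) == 1 and all(c >= 0 for c in row) for row in C)

def is_lossless(C):
    # every output y_j can arise from at most one input x_i
    return all(sum(1 for row in C if row[j] > 0) <= 1
               for j in range(len(C[0])))

def is_useless(C):
    # a row-stochastic matrix has rank 1 exactly when all rows coincide
    return all(row == C[0] for row in C)

def posterior(C, omega, j):
    # probability of each input x_i given that y_j was observed
    weights = [C[i][j] * omega[i] for i in range(len(C))]
    total = sum(weights)
    return [w / total for w in weights]

# C[i][j] = C(y_j | x_i); two inputs and three outputs
lossless = [[F(1, 2), F(1, 2), 0], [0, 0, 1]]
useless = [[F(1, 3), F(2, 3), 0], [F(1, 3), F(2, 3), 0]]
noisy = [[F(3, 4), F(1, 4), 0], [0, F(1, 4), F(3, 4)]]
omega = [F(1, 2), F(1, 2)]

for name, C in (("lossless", lossless), ("useless", useless), ("noisy", noisy)):
    print(name, is_row_stochastic(C), is_lossless(C), is_useless(C))
print(posterior(lossless, omega, 0), posterior(noisy, omega, 1))
```

For the lossless example the posterior after observing an output is concentrated on a single input, while for the useless example the posterior equals the input state.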

Theorem 5 (Channel coding). Let $\mathcal{C}$ be a channel with input
algebra $X$ and output algebra $Y$. Let $\{x_i\}_{i=1}^{n}$ and
$\{y_j\}_{j=1}^{m}$ be atomic bases for $X$ and $Y$ resp. Given a state
$\omega$ on $X$, if the channel is not useless then for each $k$ there
are subalgebras $Y_k \subset \otimes^k Y$, $X_k \subset \otimes^k X$, a
map $C_k : Y_k \to X_k$ induced by $C$ and a lossless channel
$L_k : Y_k \to X_k$ such that

$$\lim_{k \to \infty} \Omega(|O_{C_k} - O_{L_k}|) = 0 \quad \text{on } T_k = Y_k \otimes X_k.$$

Here $\Omega = \otimes^\infty \Omega_C$ and on
$\otimes^k Y \otimes \otimes^k X$ it acts as
$\Omega^k = \otimes^k \Omega_C$ where $\Omega_C$ is the state induced
by the channel and a given input state $\omega$. Moreover, if
$r_k = \dim(X_k)$ then $R = \frac{\log r_k}{k}$, called the
transmission rate, is independent of $k$.

Let us clarify the meaning of the above statements. The theorem simply
states that on the chosen set of codewords the channel output of $C_k$
induced by the given channel can be made arbitrarily close to that of a
lossless channel $L_k$. Since a lossless channel has a definite
decision scheme for decoding, the choice of $L_k$ is effectively a
decision scheme for decoding the original channel's output when the
input is restricted to our "code-book". This implies that the
probability of error tends to 0: it is possible to choose a set of
"codewords" which can be transmitted with high reliability. The proof
of the theorem [PB] uses algebraic arguments only. The theorem
guarantees "convergence in the mean" in the appropriate subspace which
implies convergence in probability. For a lossless channel the input
entropy $H(X)$ is equal to the mutual information. We may think of this
as conservation of entropy or information, which justifies the term
"lossless". Since it is always the case that
$H(X) - H(X|Y) = I(X, Y)$, the quantity $H(X|Y)$ can be considered the
loss due to the channel. The algebraic version of the theorem serves
two primary purposes. First, it gives us the abelian perspective from
which we will seek possible extensions to the non-commutative case.
Secondly, the channel map $L$ can be used for a decoding scheme. Thus
we may think of a coding-decoding scheme for a given channel as a
sequence of pairs $(X_k, L_k)$ as above.
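
The identity $H(X) - H(X|Y) = I(X, Y)$ and the two extreme cases can be checked numerically. The sketch below assumes the channel is given as a row-stochastic matrix `C[i][j]` standing for $C(y_j|x_i)$ with input distribution `omega`; the function names and sample channels are illustrative assumptions.

```python
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(C, omega):
    """Return (H(X), H(X|Y), I(X, Y)) for the channel matrix
    C[i][j] = C(y_j | x_i) and input distribution omega[i]."""
    n_in, n_out = len(C), len(C[0])
    joint = [[C[i][j] * omega[i] for j in range(n_out)] for i in range(n_in)]
    h_x = entropy(omega)
    h_x_given_y = 0.0
    for j in range(n_out):
        p_y = sum(joint[i][j] for i in range(n_in))
        if p_y > 0:
            # entropy of the posterior on inputs, weighted by P(y_j)
            h_x_given_y += p_y * entropy([joint[i][j] / p_y for i in range(n_in)])
    return h_x, h_x_given_y, h_x - h_x_given_y

omega = [0.5, 0.5]
lossless = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
useless = [[0.2, 0.8, 0.0], [0.2, 0.8, 0.0]]
noisy = [[0.75, 0.25, 0.0], [0.0, 0.25, 0.75]]

for name, C in (("lossless", lossless), ("useless", useless), ("noisy", noisy)):
    print(name, mutual_information(C, omega))
```

For the lossless example the loss $H(X|Y)$ vanishes and the mutual information equals the input entropy, while for the useless example the mutual information is zero.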



The coding theorems can be extended to more complicated scenarios like
ergodic sources and channels with finite memory. We will not pursue
these issues further here. But we are confident that these
generalizations can be appropriately formulated and proved in the
algebraic framework. In the preceding sections we have laid the basic
algebraic framework for classical information theory. Although we often
confined our discussion to finite-dimensional algebras corresponding to
finite sample spaces, it is possible to extend it to
infinite-dimensional algebras of continuous sample spaces. These topics
will be investigated in the future in the non-commutative setting. We
will delve deeper into these analogies and aim to throw light on some
basic issues like quantum Huffman coding [BFGL00], channel capacities
and general no-go theorems among others, once we formulate the
appropriate models.